Aspera | 1000 Genomes

How to download ENA files using aspera?

Answer:

The International Genome Sample Resource (IGSR) has stopped mirroring sequence files from the ENA. Instead, it uses the sequence.index files to point to the FTP location of each fastq file.

e.g. ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR008/ERR008901/ERR008901_1.fastq.gz

These files can also be downloaded using Aspera. You will need to get the ascp program as described in "How to download files using aspera?" below.

Then you will need to change the ENA FTP host to the ENA Aspera host.

This means you need to change the FTP URL to something suitable for the ascp command:

e.g. ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR008/ERR008901/ERR008901_1.fastq.gz

becomes

era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/ERR008/ERR008901/ERR008901_1.fastq.gz

Your ascp command would then look like:

 ascp -i bin/aspera/etc/asperaweb_id_dsa.openssh -Tr -Q -l 100M -P33001 -L- era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/ERR008/ERR008901/ERR008901_1.fastq.gz ./
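
The URL rewrite described above can be scripted. The following is a minimal sketch, not an official IGSR or ENA tool: the function name ena_ftp_to_ascp and the variable ENA_ASPERA_USER are invented here for illustration, and the default account name era-fasp is taken from the ENA example further down this page.

```shell
# Account used on the ENA Aspera host; override it if ENA documents another one.
ENA_ASPERA_USER="${ENA_ASPERA_USER:-era-fasp}"

# Convert an ENA FTP URL into the source argument expected by ascp.
ena_ftp_to_ascp() {
    url="$1"
    prefix="ftp://ftp.sra.ebi.ac.uk"
    case "$url" in
        "$prefix"/*)
            # Drop the FTP scheme and host, keep the path, prepend user@host:
            printf '%s@fasp.sra.ebi.ac.uk:%s\n' "$ENA_ASPERA_USER" "${url#"$prefix"}"
            ;;
        *)
            printf 'not an ENA FTP URL: %s\n' "$url" >&2
            return 1
            ;;
    esac
}
```

The result can be passed to ascp in place of the hand-written source, for example as "$(ena_ftp_to_ascp "$url")".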

Further details

For further information, please contact info@1000genomes.org. For full documentation on using Aspera to download files from the ENA, please see their document "Downloading sequence files".

How to download files using aspera?

Answer:

Download Aspera

Aspera provides a fast method of downloading data. To use the Aspera service you need to download the Aspera Connect software, which provides a bulk download client called ascp.

Command line

For versions 3.3.3 and newer of the command line tool ascp, you need to use a command line like:

     ascp -i bin/aspera/etc/asperaweb_id_dsa.openssh -Tr -Q -l 100M -P33001 -L- fasp-g1k@fasp.1000genomes.ebi.ac.uk:vol1/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz ./

For versions 3.3.2 and older, you need to use a command line like:

     ascp -i bin/aspera/etc/asperaweb_id_dsa.putty -Tr -Q -l 100M -P33001 -L- fasp-g1k@fasp.1000genomes.ebi.ac.uk:vol1/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz ./

Note that the only difference between these commands is the key file: newer versions of ascp use asperaweb_id_dsa.openssh in place of asperaweb_id_dsa.putty. Aspera documents this change. You can check the version of ascp you have using:

   ascp --version

The argument to -i may also differ depending on where the default key file is installed. The command should not ask you for a password. All the IGSR data is accessible without a password, but you do need to give ascp the SSH key to complete the command.
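
The version rule above can be written as a small helper. This is only a sketch under stated assumptions: key_for_ascp_version is a name invented here, it expects a plain dotted numeric version such as 3.3.3 as its argument, and extracting that string from the output of ascp --version is left out because the exact output format is not described on this page.

```shell
# Print the key file name that matches a given ascp version.
# Assumes a dotted numeric version ("major.minor.patch"); extra parts are ignored.
key_for_ascp_version() {
    # Split the argument on dots into positional parameters.
    old_ifs="$IFS"
    IFS=.
    set -- $1
    IFS="$old_ifs"
    major="${1:-0}"
    minor="${2:-0}"
    patch="${3:-0}"
    # 3.3.3 and newer use the OpenSSH key; anything older uses the PuTTY key.
    if [ "$major" -gt 3 ] ||
       { [ "$major" -eq 3 ] && [ "$minor" -gt 3 ]; } ||
       { [ "$major" -eq 3 ] && [ "$minor" -eq 3 ] && [ "$patch" -ge 3 ]; }; then
        echo asperaweb_id_dsa.openssh
    else
        echo asperaweb_id_dsa.putty
    fi
}
```

The printed name still has to be combined with the directory that holds the key files on your system before it is given to -i.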

Files on the ENA FTP

Some of the data we provide URLs for is hosted on the ENA FTP site. ENA provide information on using Aspera with their FTP site.

As an example of downloading a file from ENA, you could use a command line like:

     ascp -i bin/aspera/etc/asperaweb_id_dsa.openssh -Tr -Q -l 100M -P33001 -L- era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/ERR008/ERR008901/ERR008901_1.fastq.gz ./

Key files

If you are unsure of the location of asperaweb_id_dsa.openssh or asperaweb_id_dsa.putty, Aspera provide some documentation on where these will be found on different systems.

Ports

For the above commands to work with your network’s firewall you need to open ports 22/tcp (outgoing) and 33001/udp (both incoming and outgoing) to the following EBI IPs:

193.62.192.6
193.62.193.6
193.62.193.135

If the firewall has UDP flood protection, it must be turned off for port 33001.
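
As a rough first check of the firewall rules, outgoing TCP access to port 22 on the listed addresses can be probed from the command line. This is a sketch, assuming a netcat (nc) that supports the -z and -w options; the names EBI_ASPERA_IPS and check_aspera_tcp are invented here. It does not test 33001/udp, which cannot be verified reliably with a simple connect probe.

```shell
# EBI addresses listed above.
EBI_ASPERA_IPS="193.62.192.6 193.62.193.6 193.62.193.135"

# Report whether port 22/tcp is reachable on each address.
check_aspera_tcp() {
    failed=0
    for ip in $EBI_ASPERA_IPS; do
        # -z: connect without sending data; -w 5: give up after 5 seconds.
        if nc -z -w 5 "$ip" 22 >/dev/null 2>&1; then
            echo "ok      $ip:22/tcp"
        else
            echo "FAILED  $ip:22/tcp"
            failed=1
        fi
    done
    return "$failed"
}
```

Running check_aspera_tcp prints one line per address and returns a non-zero status if any probe fails.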

Browser

Our Aspera browser interface no longer works. If you wish to download files using a web interface, we recommend the Globus interface we provide. If you previously relied on the Aspera web interface and wish to discuss your options, please email us at info@1000genomes.org.

Further details

For further information, please contact info@1000genomes.org.

What tools can I use to download 1000 Genomes data?

Answer:

The 1000 Genomes data is available via FTP, HTTP and Aspera. Any standard tool like wget or ftp should be able to download from our FTP or HTTP sites. To use Aspera you need to download their client.
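
For the wget route, a download command can be built from the same file path used in the Aspera examples. This is a sketch only: the host name in IGSR_FTP_BASE is an assumption that is not stated on this page, and igsr_wget_cmd is a name invented here. The function prints the command rather than running it, so it can be inspected first.

```shell
# Assumed FTP base URL for the 1000 Genomes site; adjust if it differs.
IGSR_FTP_BASE="ftp://ftp.1000genomes.ebi.ac.uk"

# Print a resumable wget command for a path such as vol1/ftp/release/...
igsr_wget_cmd() {
    # Strip a leading slash so the URL has exactly one separator.
    path="${1#/}"
    # -c lets wget continue a partially downloaded file.
    printf 'wget -c %s/%s\n' "$IGSR_FTP_BASE" "$path"
}
```

For example, igsr_wget_cmd vol1/ftp/release/20100804/ALL.2of4intersection.20100804.genotypes.vcf.gz prints the wget command for the file used in the ascp examples.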

IGSR: The International Genome Sample Resource

Supporting open human variation data
